06/06/2023

Example: Interpersonal Interaction

39 patient-therapist dyads

15 min annotated video recordings of therapy session

Possible research questions

  • Can we extract latent interpersonal interaction states over time?
  • What are the dynamics over time in interpersonal interaction states?
  • Do patient-therapist dyads differ in their dynamics over time?
  • Can this between-dyad heterogeneity be explained by, for example, depression improvement?

The hidden Markov model

HMM: probabilistic model to infer hidden states \(S_t \in \{1, 2, ..., m\}\) at

  • each time point \(t = 1, 2, ..., T\)
  • from the observed series \((y_{k1}, y_{k2}, ..., y_{kT})\)
  • for dependent variable \(k = 1, 2, ..., K\)

The hidden Markov model

We assume:

  • observation \(y_{kt}\) is generated by an underlying, latent state \(S_t\),
  • the sequence of hidden states \(\{S_t : t = 1, 2, ..., T\}\) forms a Markov chain: the probability of switching to the next state \(S_{t+1}\) depends only on the current state \(S_t\): \(P(S_{t+1} \mid S_t, S_{t-1}, ..., S_1) = P(S_{t+1} \mid S_t)\)

The hidden Markov model

Two sets of parameters:

  1. probability of transitioning between latent states \(\gamma_{ij}=P(S_{t+1} = j | S_t = i)\)
  2. probability of emitting an observation given the current latent state \(P(y_{kt} | S_t)\)
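
These two parameter sets are enough to generate data. A minimal base-R sketch (all values hypothetical) that draws one sequence from a 2-state HMM with a single 3-category outcome:

```r
# Hypothetical toy parameters: transition matrix and emission probabilities
set.seed(1)
m <- 2
gamma <- matrix(c(0.9, 0.1,              # P(S_{t+1} = j | S_t = i)
                  0.2, 0.8), nrow = m, byrow = TRUE)
theta <- matrix(c(0.7, 0.2, 0.1,         # P(y_t = q | S_t = i)
                  0.1, 0.3, 0.6), nrow = m, byrow = TRUE)
n_t <- 100
S <- integer(n_t); y <- integer(n_t)
S[1] <- 1
y[1] <- sample(1:3, 1, prob = theta[S[1], ])
for (t in 2:n_t) {
  S[t] <- sample(1:m, 1, prob = gamma[S[t - 1], ])  # Markov property
  y[t] <- sample(1:3, 1, prob = theta[S[t], ])      # emission given the state
}
```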

The hidden Markov model

Given example data, assume

  • a categorical emission distribution: \(y_{kt} \mid S_t = i \sim \text{Cat}(\theta_{ki})\)
  • the \(y_{k.}\) are conditionally independent given \(\{S_t : t = 1, 2, ..., T\}\), to accommodate multivariate data

The HMM is suited for one sequence of data

Options for analyzing data from multiple individuals:

  • fit an HMM to the data of each individual separately

  • fit one and the same HMM to the data of all individuals

    • strong assumption: subjects do not differ with respect to the HMM parameters

  • explain some of the differences using covariates (e.g., R package depmixS4)

    • does not allow (quantification of) natural variation (i.e., heterogeneity) between individuals
    • does not allow for individual-specific dynamics over time

Extending the HMM to a multilevel framework

Embedding the HMM in a multilevel framework enables simultaneous estimation of:

  • group level parameter estimates
  • subject specific parameter estimates for each subject \(n\)

Multinomial logistic regression is used to estimate the transition probabilities \(\gamma_{nij}\) and the variable- and state-dependent categorical probability distributions \(\theta_{nkiq}\):

\(\gamma_{nij} = \frac{\exp(\alpha_{nij})}{1 + \sum_{l = 2}^{m} \exp(\alpha_{nil})} \quad\) and \(\quad {\theta_{nkiq} = \frac{\exp{(\beta_{nkiq})}}{1 + \sum_{l=2}^{Q} \exp{(\beta_{nkil})}}}\)

where \(\alpha_{nij} = \bar{\alpha}_{ij} + \epsilon_{\left[\alpha\right]nij}\) and \(\beta_{nkiq} = \bar{\beta}_{kiq} + \epsilon_{\left[\beta\right]nkiq}\)
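
The inverse mapping can be sketched in base R (hypothetical intercepts; the first category is the reference with its coefficient fixed to 0, which supplies the 1 in the denominator):

```r
# Baseline-category multinomial logit for one row i of the transition matrix:
# alpha for the reference category (j = 1) is fixed at 0, so exp(0) = 1.
alpha <- c(0, -1.5, -2.0)                  # hypothetical intercepts
gamma_row <- exp(alpha) / sum(exp(alpha))  # = exp(alpha_j) / (1 + sum_{l>=2} exp(alpha_l))
round(gamma_row, 3)                        # a valid probability row, summing to 1
```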

Adopt a Bayesian approach

The mHMMbayes package: an example

39 patient-therapist dyads

15 min annotated video recordings of therapy session

Fitting a simple model - input

Fitting a 3 state multilevel hidden Markov model on the nonverbal data:

library(mHMMbayes)

## specifying general model properties:
 # number of states m
m <- 3
 # number of dependent variables
n_dep <- 6
 # number of output categories for each dependent var
q_emiss <- c(3, 2, 2, 3, 2, 2) 

Fitting a simple model - input

## specifying starting values
 # Start values of the TPM
start_TM <- diag(.9, m)
start_TM[lower.tri(start_TM) | upper.tri(start_TM)] <- .05

 # Start values for the emission distribution
start_EM <- list(matrix(c(0.70, 0.20, 0.10, # vocalizing therapist
                           0.05, 0.90, 0.05,
                           0.25, 0.70, 0.05), byrow = TRUE, nrow = m, ncol = q_emiss[1]), 
                  matrix(c(0.15, 0.85, # looking therapist
                           0.35, 0.65,
                           0.30, 0.70), byrow = TRUE, nrow = m, ncol = q_emiss[2]), 
                  matrix(c(0.80, 0.20, # leg therapist
                           0.80, 0.20,
                           0.80, 0.20), byrow = TRUE, nrow = m, ncol = q_emiss[3]) 
                  # continue emission probs here for the three other variables
                 ) 
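
Starting values are easy to get wrong. A quick base-R sanity check (repeating the transition-matrix construction above): each row of a starting probability matrix must sum to 1.

```r
# Every row of a starting transition (or emission) matrix must be a valid
# probability distribution, i.e., sum to 1.
m <- 3
start_TM <- diag(.9, m)
start_TM[lower.tri(start_TM) | upper.tri(start_TM)] <- .05
stopifnot(all(abs(rowSums(start_TM) - 1) < 1e-8))
```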

Fitting a simple model - input

# MCMC settings (matching the output below: 20000 iterations, burn-in of 5000)
J <- 20000
burn_in <- 5000

# Run a model without covariate(s) and default priors:
out_3st <- mHMM(s_data = MHMM_nonverbal,
                gen = list(m = m, n_dep = n_dep, q_emiss = q_emiss),
                start_val = c(list(start_TM), start_EM),
                mcmc = list(J = J, burn_in = burn_in))

Model output - print()

out_3st
## Number of subjects: 39 
## 
## 20000 iterations used in the MCMC algorithm with a burn in of 5000 
## Average Log likelihood over all subjects: -2200.246 
## Average AIC over all subjects: 4460.491 
## 
## Number of states used: 3 
## 
## Number of dependent variables used: 6

Model output - summary()

summary(out_3st)
## State transition probability matrix 
##  (at the group level): 
##  
##              To state 1 To state 2 To state 3
## From state 1      0.936      0.025      0.038
## From state 2      0.029      0.923      0.048
## From state 3      0.074      0.101      0.825
## 
##  
## Emission distribution for each of the dependent variables 
##  (at the group level): 
##  
## $T_Vocalizing
##         Category 1 Category 2 Category 3
## State 1      0.864      0.037      0.099
## State 2      0.026      0.961      0.013
## State 3      0.159      0.794      0.047
## 
## $T_Looking
##         Category 1 Category 2
## State 1      0.044      0.956
## State 2      0.294      0.706
## State 3      0.231      0.769
## 
## $T_Leg
##         Category 1 Category 2
## State 1      0.940      0.060
## State 2      0.929      0.071
## State 3      0.836      0.164
## 
## $P_Vocalizing
##         Category 1 Category 2 Category 3
## State 1      0.014      0.973      0.013
## State 2      0.860      0.023      0.117
## State 3      0.640      0.184      0.176
## 
## $P_Looking
##         Category 1 Category 2
## State 1      0.329      0.671
## State 2      0.067      0.933
## State 3      0.305      0.695
## 
## $P_Leg
##         Category 1 Category 2
## State 1      0.661      0.339
## State 2      0.979      0.021
## State 3      0.454      0.546

State composition - obtain_emiss()

emiss_pop <- obtain_emiss(out_3st)

State dynamics - obtain_gamma()

gamma_pop <- obtain_gamma(out_3st)


State dynamics at dyad level - obtain_gamma()

gamma_subj <- obtain_gamma(out_3st, level = "subject")


Explaining heterogeneity - including covariate(s)

# Run a model WITH covariate(s) and default priors:
out_3st_cov <- mHMM(s_data = MHMM_nonverbal,
                    gen = list(m = m, n_dep = n_dep, q_emiss = q_emiss),
                    xx = c(list(xx_depr), vector(mode = "list", length = n_dep)),
                    start_val = c(list(start_TM), start_EM),
                    mcmc = list(J = J, burn_in = burn_in))


Predicted state dynamics



Inferred state sequence - vit_mHMM()

state_seq_3st <- vit_mHMM(out_3st_cov, s_data = MHMM_nonverbal)


Example II: isolating CAB crisis states

Four CAB-based crisis states





Personalized dynamics and trajectories

R package wrap up

  • mHMMbayes fits multilevel hidden Markov models using Bayesian estimation

  • Allows for heterogeneity between subjects, while estimating one overall HMM

  • The model accommodates multivariate data

  • The model accommodates individual level covariates

  • The package also includes

    • various automated plotting functions
    • a function to simulate data
    • a function to obtain the most likely hidden state sequence for each individual using the Viterbi algorithm
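
To illustrate the last point, a minimal base-R sketch of the Viterbi recursion for a single sequence, with hypothetical toy parameters; vit_mHMM() applies the same idea per subject using the fitted subject-specific parameters:

```r
# Viterbi algorithm for one categorical sequence y, transition matrix gamma,
# emission matrix theta (rows = states, cols = categories), initial probs delta.
viterbi <- function(y, gamma, theta, delta) {
  n_t <- length(y); m <- nrow(gamma)
  logp <- matrix(-Inf, n_t, m)   # best log-probability of each state at time t
  back <- matrix(0L, n_t, m)     # backpointers for path reconstruction
  logp[1, ] <- log(delta) + log(theta[, y[1]])
  for (t in 2:n_t) {
    for (j in 1:m) {
      cand <- logp[t - 1, ] + log(gamma[, j])
      back[t, j] <- which.max(cand)
      logp[t, j] <- max(cand) + log(theta[j, y[t]])
    }
  }
  path <- integer(n_t)
  path[n_t] <- which.max(logp[n_t, ])
  for (t in (n_t - 1):1) path[t] <- back[t + 1, path[t + 1]]
  path
}

gamma <- matrix(c(.9, .1, .1, .9), 2, byrow = TRUE)   # sticky transitions
theta <- matrix(c(.9, .1, .1, .9), 2, byrow = TRUE)   # state-typical emissions
path <- viterbi(c(1, 1, 2, 2), gamma, theta, delta = c(.5, .5))
```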

Vignettes: tutorial

Vignettes: estimation methods

Thank you

Determining the number of hidden states

Table 1. Model fit and convergence

                               2 state model   3 state model   4 state model
AIC                            4759            4460            4310
Mean Rhat (% > 1.2):
  State transition parameters  1.03 (0%)       1.11 (0%)       1.15 (25%)
  State composition parameters 1.05 (4%)       1.11 (5%)       1.17 (29%)

How much data?

Use multivariate data! For recovering the fixed (group-level) parameters only, you need, for example:

  • low complexity:
    1. 400 obs for 2 dep var on 5 subjects
    2. 800 obs for 1 dep var on 5 subjects
    3. 400 obs for 1 dep var on 30 subjects
  • high complexity:
    1. 800 obs for 4 dep var on 5 subjects
    2. 1600 obs for 2 dep var on 5 subjects
    3. 1600 obs for 1 dep var on 30 subjects

To also recover the heterogeneity between subjects, focus on the number of subjects: at least 30 subjects and 2 dependent variables.

mHMM Metropolis within Gibbs sampler

  1. Obtain forward probabilities \(\alpha_{t(ni)} = Pr\big(O_1 = o_1, ..., O_t = o_t, S_t = i\big)\) for each subject, using the current values of the subject-specific model parameters.
  2. Sample the hidden state sequences in a backward manner using \(\alpha_{t(ni)}\) and \(\gamma_{nij}\).
  3. Given the current subject-specific parameters, draw new group-level parameter estimates using a Gibbs step.
  4. Given
    • the observed event sequence for each subject,
    • the sampled hidden state sequence for each subject, and
    • the group-level parameter distributions,
    draw new subject-specific parameters using a Random Walk (RW) Metropolis sampler.
  5. Repeat steps 1-4 a very large number of times.
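
Step 4 can be illustrated with a generic Random Walk Metropolis update in base R; the standard normal log-density below is only a stand-in for the actual conditional posterior of a subject-specific parameter:

```r
# Random Walk Metropolis for a single scalar parameter: propose a local move,
# accept with probability min(1, target ratio), otherwise keep the old value.
set.seed(42)
log_target <- function(x) dnorm(x, mean = 0, sd = 1, log = TRUE)  # stand-in posterior
n_iter <- 5000
draws <- numeric(n_iter)
x <- 0
for (j in 1:n_iter) {
  prop <- x + rnorm(1, sd = 0.8)                                  # random walk proposal
  if (log(runif(1)) < log_target(prop) - log_target(x)) x <- prop # accept/reject
  draws[j] <- x
}
```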

Model output - posterior densities

plot(out_3st, component = "emiss", dep = 1, col = Voc_col, 
     dep_lab = c("Therapist vocalizing"), cat_lab = Voc_lab)

Model output - posterior densities

plot(out_3st, component = "emiss", dep = 4, col = Voc_col, 
     dep_lab = c("Patient vocalizing"), cat_lab = Voc_lab)

Estimation of the HMM - Bayesian

Advantages:

  • Bayesian methods yield many additional metrics, such as standard errors and credible intervals, that are much harder to extract using frequentist methods.
  • Missing data are easy to handle.
  • Incorporating prior knowledge (through the prior distribution specification) accommodates smaller sample sizes.
  • Adds flexibility, making it relatively easy to extend to more complex models.